
best effort tito#955

Open
eligotts wants to merge 4 commits into main from eli/best-effort-tito

Conversation


@eligotts eligotts commented Feb 24, 2026

Description

Best-effort TITO. Instead of assuming extension and looking back at strictly the last trajectory step, we walk backward until we find a MESSAGES-level prefix hit.

Tested on both wiki-search and eligottlieb/poker-multiagent; the latter previously produced the loud failure with TITO because it does explicit rewriting of history (like context folding).

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Test improvement

Testing

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes


Note

Medium Risk
Changes how token-stitching selects prior-turn tokens and adds a new fallback path; mistakes could cause silent behavior changes (more MITO calls) or incorrect token prefixes in production.

Overview
Makes token-stitching (TITO) best-effort: get_prompt_ids now scans the trajectory backwards to find the largest message-level prefix match (normalizing message structures) instead of assuming the last step matches, and returns None when no match exists.

Updates get_native_response to fall back to the standard /chat/completions path when get_prompt_ids returns None, avoiding broken /chat/completions/tokens requests on history rewrites/context folding. Adds async tests covering largest-prefix selection, no-prefix behavior, and ensuring the correct route is used depending on whether prompt token IDs are available.

Written by Cursor Bugbot for commit 6ae6aff. This will update automatically on new commits.

@eligotts eligotts changed the title from "best effort tito, we look back in the trajectory list for last step w…" to "best effort tito" on Feb 24, 2026

@mikasenghaas mikasenghaas left a comment


yeap, this works!

prev_turn_completion_ids = prev_turn_tokens["completion_ids"]
prev_turn_ids = prev_turn_prompt_ids + prev_turn_completion_ids

def normalize_for_comparison(value: Any) -> Any:

should we make this a general message util? seems useful in other places too. also, I vaguely remember we already have a similar util to this, but I might be wrong
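A guess at what such a shared message util could look like, normalizing nested message structures by recursively dropping None-valued fields; this is a sketch, not the PR's actual implementation:

```python
from typing import Any


def normalize_for_comparison(value: Any) -> Any:
    """Recursively drop None-valued dict fields so that, e.g., an emitted
    {"content": None} compares equal to a message that omits content."""
    if isinstance(value, dict):
        return {k: normalize_for_comparison(v) for k, v in value.items()
                if v is not None}
    if isinstance(value, list):
        return [normalize_for_comparison(v) for v in value]
    return value
```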


return 0

# we add suffix_ids to prev_turn_ids. suffix_ids are tokens that are added

I know this is unrelated to this PR, but I think we might be able to remove the suffix part, since we tokenize env_response_ids = full_ids[len(prev_turn_ids) :] rather than tokenizing env response ids in isolation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
best_step_tokens = step_tokens
if best_prefix_len == len(normalized_prompt_messages):
break


Per-turn backward scan may be expensive

Medium Severity

get_prompt_ids() now walks backward over the entire state["trajectory"] and calls to_native_prompt() per step until it finds the best prefix, which can add significant overhead on long trajectories and slow every generation turn.


if best_step_tokens is None:
return None
return best_step_tokens["prompt_ids"] + best_step_tokens["completion_ids"]


Prefix match can miss equivalent messages

Medium Severity

get_prompt_ids’s new message-level prefix matcher compares normalized message objects for strict equality, which can differ across representations (e.g., to_native_prompt emitting {"content": None} while incoming prompt_messages omits content, or other default/extra fields). This can produce false “no prefix match”, disabling the token route unexpectedly.
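A concrete instance of the mismatch the bot describes (message shapes are illustrative):

```python
# Two renderings of the same assistant turn: to_native_prompt may emit an
# explicit null content field, while the incoming prompt omits it entirely.
emitted = {"role": "assistant", "content": None,
           "tool_calls": [{"id": "call_1", "type": "function"}]}
incoming = {"role": "assistant",
            "tool_calls": [{"id": "call_1", "type": "function"}]}
assert emitted != incoming  # strict equality sees two different messages
```

Unless normalization strips such default fields before comparing, the prefix matcher reports no hit and the token route is disabled for the turn.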



@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.


prev_turn_ids = await find_largest_prefix_match_tokens()
if prev_turn_ids is None:
return None

Prefix match ignores tool-dependent tokenization

Medium Severity

find_largest_prefix_match_tokens() selects a trajectory step using only a message-level prefix comparison, but the stitched prev_turn_ids are later combined with full_ids produced by tokenize(..., tools=oai_tools). If the effective tool set differs from when the matched step’s tokens were produced, prev_turn_ids may not align with full_ids, yielding incorrect env_response_ids and an invalid prompt for /chat/completions/tokens.

